Parsing Algorithms and Metrics
Abstract
Many different metrics exist for evaluating parsing results, including Viterbi, Crossing Brackets Rate, Zero Crossing Brackets Rate, and several others. However, most parsing algorithms, including the Viterbi algorithm, attempt to optimize the same metric, namely the probability of getting the correct labelled tree. By choosing a parsing algorithm appropriate for the evaluation metric, better performance can be achieved. We present two new algorithms: the "Labelled Recall Algorithm," which maximizes the expected Labelled Recall Rate, and the "Bracketed Recall Algorithm," which maximizes the Bracketed Recall Rate. Experimental results are given, showing that the two new algorithms have improved performance over the Viterbi algorithm on many criteria, especially the ones that they optimize.

1 Introduction

In corpus-based approaches to parsing, one is given a treebank (a collection of text annotated with the "correct" parse tree) and attempts to find algorithms that, given unlabelled text from the treebank, produce as similar a parse as possible to the one in the treebank. Various methods can be used for finding these parses. Some of the most common involve inducing Probabilistic Context-Free Grammars (PCFGs), and then parsing with an algorithm such as the Labelled Tree (Viterbi) Algorithm, which maximizes the probability that the output of the parser (the "guessed" tree) is the one that the PCFG produced. This implicitly assumes that the induced PCFG does a good job modeling the corpus.

There are many different ways to evaluate these parses. The most common include the Labelled Tree Rate (also called the Viterbi Criterion or Exact Match Rate), Consistent Brackets Recall Rate (also called the Crossing Brackets Rate), Consistent Brackets Tree Rate (also called the Zero Crossing Brackets Rate), and Precision and Recall. Despite the variety of evaluation metrics, nearly all researchers use algorithms that maximize performance on the Labelled Tree Rate, even in domains where they are evaluating using other criteria. We propose that by creating algorithms that optimize the evaluation criterion, rather than some related criterion, improved performance can be achieved.

In Section 2, we define most of the evaluation metrics used in this paper and discuss previous approaches. Then, in Section 3, we discuss the Labelled Recall Algorithm, a new algorithm that maximizes performance on the Labelled Recall Rate. In Section 4, we discuss another new algorithm, the Bracketed Recall Algorithm, that maximizes performance on the Bracketed Recall Rate (closely related to the Consistent Brackets Recall Rate). Finally, we give experimental results in Section 5 using these two algorithms in appropriate domains, and compare them to the Labelled Tree (Viterbi) Algorithm, showing that each algorithm generally works best when evaluated on the criterion that it optimizes.

2 Evaluation Metrics

In this section, we first define basic terms and symbols. Next, we define the different metrics used in evaluation. Finally, we discuss the relationship of these metrics to parsing algorithms.

2.1 Basic Definitions

Let w_a denote word a of the sentence under consideration. Let w_a^b denote w_a w_{a+1} ... w_{b-1} w_b; in particular, let w_1^n denote the entire sequence of terminals (words) in the sentence under consideration. In this paper we assume all guessed parse trees are binary branching.
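To make the notation concrete, the sketch below shows one standard realisation of the Labelled Tree (Viterbi) Algorithm mentioned in the introduction: a CKY-style chart parser that, for a toy PCFG in Chomsky normal form, finds the most probable tree over each span w_s^t and returns it as a set of labelled spans (s, t, X), the representation formalised just below. The grammar, rule probabilities, and example sentence are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict

# A hypothetical toy PCFG in Chomsky normal form: binary rules X -> Y Z and
# lexical rules X -> w, each with a probability.  Purely for illustration.
binary_rules = {          # (Y, Z) -> list of (X, prob)
    ("NP", "VP"): [("S", 1.0)],
    ("DT", "NN"): [("NP", 0.7)],
    ("VB", "NP"): [("VP", 1.0)],
}
lexical_rules = {         # word -> list of (X, prob)
    "the": [("DT", 1.0)],
    "dog": [("NN", 0.5)],
    "cat": [("NN", 0.5)],
    "saw": [("VB", 1.0)],
}

def viterbi_parse(words):
    """Return the most probable labelled tree, as a set of (s, t, X) triples
    with 1-based spans, for a sentence w_1 ... w_n under the toy PCFG above.
    Assumes the sentence is parseable by the grammar."""
    n = len(words)
    best = defaultdict(float)     # (s, t, X) -> best probability of that constituent
    back = {}                     # (s, t, X) -> (split point, left label, right label)

    # Length-one spans w_s^s come from the lexical rules.
    for s, w in enumerate(words, start=1):
        for X, p in lexical_rules.get(w, []):
            best[s, s, X] = p

    # Longer spans w_s^t are built from two adjacent subspans w_s^r and w_{r+1}^t.
    for length in range(2, n + 1):
        for s in range(1, n - length + 2):
            t = s + length - 1
            for r in range(s, t):
                for (Y, Z), parents in binary_rules.items():
                    py, pz = best[s, r, Y], best[r + 1, t, Z]
                    if py == 0.0 or pz == 0.0:
                        continue
                    for X, p in parents:
                        score = p * py * pz
                        if score > best[s, t, X]:
                            best[s, t, X] = score
                            back[s, t, X] = (r, Y, Z)

    # Read the best tree off the backpointers, starting from the root (1, n, S).
    def collect(s, t, X, tree):
        tree.add((s, t, X))
        if (s, t, X) in back:
            r, Y, Z = back[s, t, X]
            collect(s, r, Y, tree)
            collect(r + 1, t, Z, tree)
        return tree

    return collect(1, n, "S", set()), best[1, n, "S"]

tree, prob = viterbi_parse(["the", "dog", "saw", "the", "cat"])
print(prob)           # probability of the single most likely labelled tree
print(sorted(tree))   # that tree as a set of (s, t, X) triples
```

The quantity this procedure maximizes, the probability of the entire guessed tree, is exactly what the Labelled Tree Rate rewards; the recall-oriented algorithms presented in the paper deliberately maximize different expectations.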
Let a parse tree T be defined as a set of triples (s, t, X), where s denotes the position of the first symbol in a constituent, t denotes the position of the last symbol, and X represents a terminal or nonterminal symbol, meeting the following three requirements. First, the sentence is spanned by the start symbol: (1, n, S) is in T, where S is the start symbol and n is the length of the sentence. Second, every word in the sentence is in the parse tree: for every s between 1 and n, the triple (s, s, w_s) is in T. Third, the tree is binary branching and consistent: for every (s, t, X) in T with s < t, there is exactly one r, Y, Z such that s <= r < t, (s, r, Y) is in T, and (r+1, t, Z) is in T.
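Given this representation, the recall-oriented metrics that the two new algorithms target are straightforward to compute. The sketch below assumes the usual per-sentence reading of these rates, namely the fraction of treebank constituents that reappear in the guessed tree, with or without matching labels; the paper's exact normalisation, given in the full text, may differ in minor details such as whether terminal nodes are counted. The example trees are hypothetical.

```python
def labelled_recall(guessed, gold):
    """Fraction of constituents (s, t, X) in the gold (treebank) tree that also
    appear, with the same span and label, in the guessed tree.  Both trees are
    sets of (s, t, X) triples as defined above."""
    return len(guessed & gold) / len(gold)

def bracketed_recall(guessed, gold):
    """Same as labelled recall, but ignoring constituent labels: only the
    spans (s, t) have to match."""
    strip = lambda tree: {(s, t) for s, t, _ in tree}
    g, c = strip(guessed), strip(gold)
    return len(g & c) / len(c)

# Hypothetical treebank and guessed trees for a four-word sentence.
gold    = {(1, 4, "S"), (1, 2, "NP"), (3, 4, "VP"),
           (1, 1, "DT"), (2, 2, "NN"), (3, 3, "VB"), (4, 4, "NN")}
guessed = {(1, 4, "S"), (1, 2, "NP"), (3, 4, "NP"),   # VP mislabelled as NP
           (1, 1, "DT"), (2, 2, "NN"), (3, 3, "VB"), (4, 4, "NN")}

print(labelled_recall(guessed, gold))   # 6/7: one constituent has the wrong label
print(bracketed_recall(guessed, gold))  # 7/7: every span is still bracketed correctly
```

The mislabelled constituent in the example lowers Labelled Recall but not Bracketed Recall, which is precisely the distinction between the quantities optimized by the Labelled Recall Algorithm and the Bracketed Recall Algorithm.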